Asymmetric and Sample Size Sensitive Entropy Measures for Supervised Learning
Authors
Abstract
Many machine learning algorithms use an entropy measure as an optimization criterion. Among the widely used entropy measures, Shannon's is one of the most popular. In some real-world applications, using such entropy measures without precautions can lead to inconsistent results. Indeed, entropy measures are built upon assumptions which are not fulfilled in many real cases. For instance, in supervised learning such as decision trees, the misclassification cost of the classes is not explicitly taken into account in the tree growing process: the misclassification costs are implicitly assumed to be the same for all classes. When those costs differ across classes, the maximum of the entropy should lie elsewhere than on the uniform probability distribution. Likewise, when the classes do not have the same a priori probability distribution, the worst case (the maximum of the entropy) should lie elsewhere than on the uniform distribution. In this paper, starting from real-world problems, we show that classical entropy measures are not suitable for building a predictive model. We then examine the main axioms that define an entropy and discuss their inadequacy for machine learning. This leads us to propose a new entropy measure that possesses more suitable properties. Finally, we carry out evaluations on data sets that illustrate the performance of the new entropy measure.
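To make the abstract's central idea concrete, the sketch below contrasts Shannon's entropy, which is maximal on the uniform distribution, with an asymmetric entropy whose maximum sits at an arbitrary reference distribution w. The specific formula used here, h_w(p) = (1/k) Σ p_i(1−p_i)/((1−2w_i)p_i + w_i²), is borrowed from the asymmetric-entropy literature as an illustration; it is an assumption that it matches the exact measure proposed in this paper.

```python
import math

def shannon_entropy(p):
    """Shannon entropy in bits; maximal on the uniform distribution."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)

def asymmetric_entropy(p, w):
    """An entropy normalized to [0, 1] whose maximum lies at the
    reference distribution w instead of the uniform distribution.
    The formula is an illustrative choice from the asymmetric-entropy
    literature, not necessarily this paper's exact measure."""
    k = len(p)
    return sum(pi * (1 - pi) / ((1 - 2 * wi) * pi + wi ** 2)
               for pi, wi in zip(p, w)) / k

# Shannon peaks at the uniform distribution:
print(shannon_entropy([0.5, 0.5]))   # 1.0
print(shannon_entropy([0.3, 0.7]))   # ~0.881, lower than at uniform

# With an imbalanced a priori distribution w = (0.3, 0.7), the
# asymmetric measure peaks at w rather than at (0.5, 0.5):
print(asymmetric_entropy([0.3, 0.7], [0.3, 0.7]))  # 1.0
print(asymmetric_entropy([0.5, 0.5], [0.3, 0.7]))  # ~0.862, below the peak
```

A split criterion built on such a measure treats the class prior w (or a cost-derived reference point) as the "most impure" node, so a decision tree stops rewarding splits that merely reproduce the imbalance.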
Similar Resources
Detecting Concept Drift in Data Stream Using Semi-Supervised Classification
Data stream is a sequence of data generated from various information sources at a high speed and high volume. Classifying data streams faces the three challenges of unlimited length, online processing, and concept drift. In related research, to meet the challenge of unlimited stream length, commonly the stream is divided into fixed size windows or gradual forgetting is used. Concept drift refer...
A New Formulation for Cost-Sensitive Two Group Support Vector Machine with Multiple Error Rate
Support vector machine (SVM) is a popular classification technique which classifies data using a max-margin separating hyperplane. The normal vector and bias of this hyperplane are determined by solving a quadratic model, which means that SVM training amounts to an optimization problem. Among the extensions of SVM, the cost-sensitive scheme refers to a model with multiple costs which conside...
Unsupervised Model Adaptation using Information-Theoretic Criterion
In this paper we propose a novel general framework for unsupervised model adaptation. Our method is based on entropy, which has previously been used as a regularizer in semi-supervised learning. In addition to conditional entropy, this technique includes another term which measures the stability of posteriors w.r.t. model parameters. The idea is to use parameters which result in both low conditio...
PAC-Bayesian Inductive and Transductive Learning
We present here a PAC-Bayesian point of view on adaptive supervised classification. Using convex analysis on the set of posterior probability measures on the parameter space, we show how to get local measures of the complexity of the classification model involving the relative entropy of posterior distributions with respect to Gibbs posterior measures. We then discuss relative bounds, comparing...
The Impact of Asymmetric Risk on Expected Return
The main goal of the present study is to test asymmetric risk pricing and compare it with the pricing of traditional risk measures in the Tehran Stock Market. Accordingly, a sample of 101 companies listed on the Tehran Stock Market during 2002-2013 was investigated. To test asymmetric risk pricing, a panel data regression model was applied. The results revealed a positive and...